ABSTRACT

Let $X$ be an observation from a $p$-variate normal distribution ($p \ge 3$) with mean vector $\theta$ and unknown positive definite covariance matrix $\Sigma$. It is desired to estimate $\theta$ under the quadratic loss

$$L(\delta,\theta,\Sigma) = (\delta-\theta)^t Q (\delta-\theta)/\mathrm{tr}(Q\Sigma),$$

where $Q$ is a known positive definite matrix. Estimators of the following form are considered:

$$\delta^c(X,W) = \bigl(I - c\,\alpha\,Q^{-1}W^{-1}/(X^tW^{-1}X)\bigr)X,$$

where $W$ is a $p \times p$ random matrix with a Wishart $(\Sigma,n)$ distribution (independent of $X$), $\alpha$ is the minimum characteristic root of $(QW)/(n-p-1)$ and $c$ is a positive constant. For appropriate values of $c$, $\delta^c$ is shown to be minimax and better than the usual estimator $\delta^0(X) = X$.
1.  Introduction

Assume $X = (X_1,\ldots,X_p)^t$ is a $p$-dimensional random vector ($p \ge 3$) which is normally distributed with mean vector $\theta = (\theta_1,\ldots,\theta_p)^t$ and positive definite covariance matrix $\Sigma$. It is desired to estimate $\theta$ by an estimator $\delta = (\delta_1,\ldots,\delta_p)^t$ under the quadratic loss

$$L(\delta,\theta,\Sigma) = (\delta-\theta)^t Q (\delta-\theta)/\mathrm{tr}(Q\Sigma),$$

where $Q$ is a positive definite $(p \times p)$ matrix.

The usual minimax and best invariant estimator for $\theta$ is $\delta^0(X) = X$. Since Stein (1955) first showed that $\delta^0$ could be improved upon for $Q = \Sigma = I$ (the identity matrix), a considerable effort by a number of authors (see the references) has gone into finding significant improvements upon $\delta^0$. For the most part these efforts have been directed towards the problems where either $\Sigma$ was known (or known up to a multiplicative constant) or where $Q = \Sigma^{-1}$ (a rather unrealistic assumption). For unknown $\Sigma$ only a few special situations have been considered. Berger and Bock (1976a) and (1976b) found minimax estimators (better than $\delta^0$) for problems in which $\Sigma$ was an unknown diagonal matrix or could be reduced to one. Gleser (1976) found minimax estimators under the assumption that the characteristic roots of $Q\Sigma$ have a known lower bound.

In this paper the fundamental problem of completely unknown $\Sigma$ will be considered. It will be assumed that an estimate $W$ of $\Sigma$ is available, where $W$ has a Wishart distribution with parameter $\Sigma$ and $n$ degrees of freedom, and is independent of $X$. Let $\mathrm{ch}_{\min}(A)$ denote the minimum characteristic root of $A$, and define

$$\alpha = [(n-p-1)\,\mathrm{ch}_{\max}(Q^{-1}W^{-1})]^{-1} = \mathrm{ch}_{\min}(QW)/(n-p-1).$$

The estimators considered in this paper will be of the form

$$\delta^c(X,W) = \Bigl(I - \frac{c\,\alpha\,Q^{-1}W^{-1}}{X^tW^{-1}X}\Bigr)X, \tag{1.1}$$

where $c$ is a positive constant. For known $\Sigma$, estimators of this form (with $(n-p-1)W^{-1}$ replaced by $\Sigma^{-1}$) were shown to be minimax in Bock (1974) and Berger (1976b), providing $0 \le c \le 2(p-2)$. In this paper $\delta^c$ is shown to be minimax for

$$0 \le c \le c_{n,p},$$

where the $c_{n,p}$ are solutions to equation (2.17), and are numerically calculated in Table 1 for certain values of $n$ and $p$.
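
As an illustration only, the estimator (1.1) can be sketched numerically as follows. The function names `min_char_root` and `delta_c` are hypothetical, NumPy is assumed to be available, and $Q$ and $W$ are assumed to be symmetric positive definite arrays; this is a sketch under those assumptions, not code from the paper.

```python
import numpy as np

def min_char_root(Q, W):
    """Smallest characteristic root of Q @ W for symmetric positive definite Q, W."""
    # Q W has the same roots as the symmetric matrix L^t W L, where Q = L L^t.
    L = np.linalg.cholesky(Q)
    return np.linalg.eigvalsh(L.T @ W @ L)[0]  # eigvalsh sorts in ascending order

def delta_c(X, W, Q, n, c):
    """Sketch of (1.1): (I - c*alpha*Q^{-1} W^{-1} / (X^t W^{-1} X)) X."""
    p = X.shape[0]
    alpha = min_char_root(Q, W) / (n - p - 1)
    Winv_X = np.linalg.solve(W, X)          # W^{-1} X
    shrink = c * alpha / (X @ Winv_X)       # c*alpha / (X^t W^{-1} X)
    return X - shrink * np.linalg.solve(Q, Winv_X)

# Hypothetical usage: W is a Wishart(I, n) draw built as G^t G.
rng = np.random.default_rng(0)
p, n = 4, 12
G = rng.standard_normal((n, p))
W = G.T @ G
X = rng.standard_normal(p) + 2.0
print(delta_c(X, W, np.eye(p), n, c=1.0))
```

With $Q = I$ and $W = (n-p-1)I$ the sketch reduces to $(1 - c/|X|^2)X$, which gives a quick sanity check of the implementation.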


Table 1.  Values of $c_{n,p}$

2.  Minimaxity of $\delta^c$

The notation $E(Z)$ will be used for the expectation of $Z$. Subscripts on $E$ will refer to parameter values, while superscripts on $E$ will refer to the random variables with respect to which the expectation is to be taken. When obvious, subscripts and superscripts will be omitted.

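For an estimator $\delta$, define the risk function

$$R(\delta,\theta,\Sigma) = E^{X,W}_{\theta,\Sigma}[L(\delta(X,W),\theta,\Sigma)].$$

For notational convenience define $n^* = (n-p-1)$ and

$$\Delta_c = \Delta_c(\theta,\Sigma) = \mathrm{tr}(Q\Sigma)[R(\delta^c,\theta,\Sigma) - R(\delta^0,\theta,\Sigma)].$$

The estimator $\delta^c$ is clearly minimax (and as good as or better than $\delta^0$) providing $\Delta_c(\theta,\Sigma) \le 0$ for all $\theta$ and $\Sigma$.

Expanding the quadratic loss $L$ for $\delta^c$ verifies that

$$\Delta_c = -2E\left[\frac{c\alpha (X-\theta)^tW^{-1}X}{X^tW^{-1}X}\right] + E\left[\frac{c^2\alpha^2 X^tW^{-1}Q^{-1}W^{-1}X}{(X^tW^{-1}X)^2}\right]. \tag{2.1}$$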


As in Berger (1976b) an integration by parts with respect to the $X_i$ gives

$$E\left[\frac{(X-\theta)^tW^{-1}X}{X^tW^{-1}X}\right] = E\left[\frac{\mathrm{tr}(\Sigma W^{-1})}{X^tW^{-1}X} - \frac{2X^tW^{-1}\Sigma W^{-1}X}{(X^tW^{-1}X)^2}\right].$$
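
The integration by parts step can be spelled out as follows. This is a sketch that assumes the usual regularity conditions and works conditionally on $W$; the symbol $g$ is introduced here for illustration only. For $X$ normal with mean $\theta$ and covariance $\Sigma$, and $g(X) = W^{-1}X/(X^tW^{-1}X)$,

$$E[(X-\theta)^t g(X)] = E\Bigl[\sum_{i,j}\Sigma_{ij}\,\frac{\partial g_i(X)}{\partial X_j}\Bigr], \qquad \frac{\partial g_i}{\partial X_j} = \frac{(W^{-1})_{ij}}{X^tW^{-1}X} - \frac{2(W^{-1}X)_i(W^{-1}X)_j}{(X^tW^{-1}X)^2}.$$

Summing against $\Sigma_{ij}$ gives

$$\sum_{i,j}\Sigma_{ij}\,\frac{\partial g_i}{\partial X_j} = \frac{\mathrm{tr}(\Sigma W^{-1})}{X^tW^{-1}X} - \frac{2X^tW^{-1}\Sigma W^{-1}X}{(X^tW^{-1}X)^2},$$

which is the integrand on the right-hand side of the identity.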


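Thus (2.1) becomes

$$\Delta_c = -E\left[\frac{c\alpha}{X^tW^{-1}X}\left\{2\,\mathrm{tr}(\Sigma W^{-1}) - \frac{4X^tW^{-1}\Sigma W^{-1}X}{X^tW^{-1}X} - \frac{c\alpha X^tW^{-1}Q^{-1}W^{-1}X}{X^tW^{-1}X}\right\}\right]. \tag{2.2}$$

Note that

$$\frac{\alpha X^tW^{-1}Q^{-1}W^{-1}X}{X^tW^{-1}X} \le \frac{\alpha}{\mathrm{ch}_{\min}(QW)} = \frac{1}{n^*}.$$

Using this in (2.2) gives

$$\Delta_c \le -E\left[\frac{c\alpha}{X^tW^{-1}X}\left\{2\,\mathrm{tr}(\Sigma W^{-1}) - \frac{4X^tW^{-1}\Sigma W^{-1}X}{X^tW^{-1}X} - \frac{c}{n^*}\right\}\right]. \tag{2.3}$$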
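
The bound used to pass from (2.2) to (2.3) can be verified with a Rayleigh quotient argument. Writing $u = W^{-1/2}X$ (a symbol introduced here for illustration only),

$$\frac{X^tW^{-1}Q^{-1}W^{-1}X}{X^tW^{-1}X} = \frac{u^tW^{-1/2}Q^{-1}W^{-1/2}u}{u^tu} \le \mathrm{ch}_{\max}(W^{-1/2}Q^{-1}W^{-1/2}) = \mathrm{ch}_{\max}(Q^{-1}W^{-1}) = \frac{1}{\mathrm{ch}_{\min}(QW)},$$

and multiplying by $\alpha = \mathrm{ch}_{\min}(QW)/n^*$ gives the upper bound $1/n^*$.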
In this expression, perform the change of variables

$$Y = \Sigma^{-1/2}X, \qquad V = \Sigma^{-1/2}W\Sigma^{-1/2}.$$

Note that $V$ is now Wishart with parameter $I$ and $n$ degrees of freedom, and that $\alpha = \mathrm{ch}_{\min}(\Sigma^{1/2}Q\Sigma^{1/2}V)/n^*$. Clearly (2.3) becomes

$$\Delta_c \le -E\left[\frac{c\alpha}{Y^tV^{-1}Y}\left\{2\,\mathrm{tr}(V^{-1}) - \frac{4Y^tV^{-2}Y}{Y^tV^{-1}Y} - \frac{c}{n^*}\right\}\right]. \tag{2.4}$$
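
The passage from (2.3) to (2.4) can be checked term by term. The sketch below assumes the change of variables is $Y = \Sigma^{-1/2}X$ and $V = \Sigma^{-1/2}W\Sigma^{-1/2}$, so that $W^{-1} = \Sigma^{-1/2}V^{-1}\Sigma^{-1/2}$:

$$\begin{aligned}
X^tW^{-1}X &= Y^tV^{-1}Y,\\
\mathrm{tr}(\Sigma W^{-1}) &= \mathrm{tr}(\Sigma^{1/2}V^{-1}\Sigma^{-1/2}) = \mathrm{tr}(V^{-1}),\\
X^tW^{-1}\Sigma W^{-1}X &= Y^tV^{-1}\Sigma^{-1/2}\,\Sigma\,\Sigma^{-1/2}V^{-1}Y = Y^tV^{-2}Y,\\
\mathrm{ch}_{\min}(QW) &= \mathrm{ch}_{\min}(Q\Sigma^{1/2}V\Sigma^{1/2}) = \mathrm{ch}_{\min}(\Sigma^{1/2}Q\Sigma^{1/2}V).
\end{aligned}$$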


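For convenience, define

$$\beta = \mathrm{ch}_{\min}(Q\Sigma), \qquad Z = Y/|Y|, \qquad \text{and} \qquad \Sigma^* = \Sigma^{1/2}Q\Sigma^{1/2}/\beta.$$

Note that $\mathrm{ch}_{\min}(\Sigma^*) = 1$. Line (2.4) can then be rewritten

$$\Delta_c \le -\frac{c\beta}{n^*}\,E^Y\left[\frac{1}{|Y|^2}\,E^V\left[\frac{\mathrm{ch}_{\min}(\Sigma^*V)}{Z^tV^{-1}Z}\left\{2\,\mathrm{tr}(V^{-1}) - \frac{4Z^tV^{-2}Z}{Z^tV^{-1}Z} - \frac{c}{n^*}\right\}\right]\right]. \tag{2.5}$$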

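To show that $\Delta_c \le 0$ it suffices to show for all $Z \in U_p$ (the unit $p$-sphere) and all $\Sigma^*$ with $\mathrm{ch}_{\min}(\Sigma^*) = 1$, that the following inequality holds:

$$E^V\left[\frac{\mathrm{ch}_{\min}(\Sigma^*V)}{Z^tV^{-1}Z}\left\{2\,\mathrm{tr}(V^{-1}) - \frac{4Z^tV^{-2}Z}{Z^tV^{-1}Z} - \frac{c}{n^*}\right\}\right] \ge 0. \tag{2.6}$$

(Note that the distribution of $V$ does not depend on $Z$ or on $\Sigma^*$.)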

Let $\Gamma$ be a $p \times p$ orthogonal matrix such that $\Gamma Z = (1,0,\ldots,0)^t$. Define $V^* = \Gamma V\Gamma^t$ and $\Sigma_Z = \Gamma\Sigma^*\Gamma^t$. Clearly $V^*$ is also Wishart $(I)$ and $\mathrm{ch}_{\min}(\Sigma_Z) = 1$. For convenience, let $v_1$ denote the (1,1) element of $(V^*)^{-1}$, $v_2$ denote the (1,1) element of $(V^*)^{-2}$, and let

$$\rho(V^*) = [2\,\mathrm{tr}\{(V^*)^{-1}\} - 4v_2/v_1].$$

It is straightforward to verify that under the above change of variables for $V$, (2.6) becomes

$$E^{V^*}\left[\frac{\mathrm{ch}_{\min}(\Sigma_ZV^*)}{v_1}\left\{\rho(V^*) - \frac{c}{n^*}\right\}\right] \ge 0. \tag{2.7}$$


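Since $\mathrm{ch}_{\min}(\Sigma_Z) = 1$, it is clear that

$$\mathrm{ch}_{\min}(\Sigma_ZV^*) \ge \mathrm{ch}_{\min}(V^*). \tag{2.8}$$

Also if $a \in U_p$ (i.e. $|a| = 1$) then

$$\mathrm{ch}_{\min}(\Sigma_ZV^*) \le a^tV^*a/(a^t\Sigma_Z^{-1}a).$$

Choosing $a$ to be $a^1$, the characteristic vector of the root 1 of $\Sigma_Z$, it follows that

$$\mathrm{ch}_{\min}(\Sigma_ZV^*) \le (a^1)^tV^*a^1. \tag{2.9}$$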
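
The two bounds (2.8) and (2.9) can be checked numerically on random draws. The sketch below is illustrative only: the function name `eigen_bounds`, the dimensions, and the way the matrix with minimum root 1 is generated are all assumptions made here, and NumPy is assumed to be available.

```python
import numpy as np

def eigen_bounds(seed, p=4, n=12):
    """Return (ch_min(V*), ch_min(Sigma_Z V*), (a^1)^t V* a^1) for one random draw."""
    rng = np.random.default_rng(seed)
    # V* ~ Wishart(I, n): G^t G with G an n x p matrix of standard normals.
    G = rng.standard_normal((n, p))
    V = G.T @ G
    # Sigma_Z with smallest characteristic root exactly 1; the other roots are arbitrary.
    roots = np.concatenate(([1.0], 1.0 + rng.uniform(0.5, 5.0, size=p - 1)))
    U, _ = np.linalg.qr(rng.standard_normal((p, p)))  # random orthogonal matrix
    Sigma_Z = U @ np.diag(roots) @ U.T
    a1 = U[:, 0]  # unit characteristic vector belonging to the root 1
    # Sigma_Z V* has the same roots as the symmetric matrix L^t V* L, Sigma_Z = L L^t.
    L = np.linalg.cholesky(Sigma_Z)
    middle = np.linalg.eigvalsh(L.T @ V @ L)[0]
    lower = np.linalg.eigvalsh(V)[0]   # right-hand side of (2.8)
    upper = a1 @ V @ a1                # right-hand side of (2.9)
    return lower, middle, upper

for seed in range(5):
    lower, middle, upper = eigen_bounds(seed)
    print(f"{lower:.3f} <= {middle:.3f} <= {upper:.3f}")
```

Each printed line should show the middle value sandwiched between the other two, up to rounding error.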

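For convenience define

$$\Omega_c = \{V^*: \rho(V^*) < c/n^*\},$$

let $\bar\Omega_c$ denote the complement of $\Omega_c$, and let $I_A(V^*)$ denote the usual indicator function on $A$. Using (2.8) and (2.9) it then follows that (2.7) will hold (and $\delta^c$ will be minimax) if

$$E^{V^*}\left[\frac{(a^1)^tV^*a^1}{v_1}\left\{\rho(V^*) - \frac{c}{n^*}\right\}I_{\Omega_c}(V^*) + \frac{\mathrm{ch}_{\min}(V^*)}{v_1}\left\{\rho(V^*) - \frac{c}{n^*}\right\}I_{\bar\Omega_c}(V^*)\right] \ge 0 \tag{2.10}$$

for all $a^1 \in U_p$.
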

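To simplify this expression further, let

$$T = \begin{pmatrix} 1 & 0 \\ 0 & S \end{pmatrix},$$

where $S$ is a $(p-1)\times(p-1)$ orthogonal matrix such that

$$Ta^1 = (b, (1-b^2)^{1/2}, 0, \ldots, 0)^t \qquad (-1 \le b \le 1).$$

In (2.10), performing the change of variables $V = TV^*T^t$ (again Wishart $(I)$) then gives as the condition for minimaxity

$$E^V\left\{\frac{(Ta^1)^tV(Ta^1)}{v_1}\left[\rho(V) - \frac{c}{n^*}\right]I_{\Omega_c}(V) + \frac{\mathrm{ch}_{\min}(V)}{v_1}\left[\rho(V) - \frac{c}{n^*}\right]I_{\bar\Omega_c}(V)\right\} \ge 0 \tag{2.11}$$

for all $a^1 \in U_p$.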
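(Note that $v_1 = ((V^*)^{-1})_{11} = (T^tV^{-1}T)_{11} = (V^{-1})_{11}$ and likewise $v_2 = ((V^*)^{-2})_{11} = (V^{-2})_{11}$.) The inequality (2.11) can be rewritten

$$c \le \frac{n^*E^V\{\rho(V)v_1^{-1}[(Ta^1)^tV(Ta^1)I_{\Omega_c}(V) + \mathrm{ch}_{\min}(V)I_{\bar\Omega_c}(V)]\}}{E^V\{v_1^{-1}[(Ta^1)^tV(Ta^1)I_{\Omega_c}(V) + \mathrm{ch}_{\min}(V)I_{\bar\Omega_c}(V)]\}}. \tag{2.12}$$

Note that

$$(Ta^1)^tV(Ta^1) = b^2(V_{11}-V_{22}) + b(1-b^2)^{1/2}(V_{12}+V_{21}) + V_{22}.$$

Hence defining

$$\begin{aligned}
\tau_0(c) &= E^V\{\rho(V)v_1^{-1}[V_{22}I_{\Omega_c}(V) + \mathrm{ch}_{\min}(V)I_{\bar\Omega_c}(V)]\},\\
\tau_1(c) &= E^V\{\rho(V)v_1^{-1}(V_{11}-V_{22})I_{\Omega_c}(V)\},\\
\tau_2(c) &= E^V\{\rho(V)v_1^{-1}(V_{12}+V_{21})I_{\Omega_c}(V)\},\\
\tau_0'(c) &= E^V\{v_1^{-1}[V_{22}I_{\Omega_c}(V) + \mathrm{ch}_{\min}(V)I_{\bar\Omega_c}(V)]\},\\
\tau_1'(c) &= E^V\{v_1^{-1}(V_{11}-V_{22})I_{\Omega_c}(V)\}, \text{ and}\\
\tau_2'(c) &= E^V\{v_1^{-1}(V_{12}+V_{21})I_{\Omega_c}(V)\},
\end{aligned}$$
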

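it is clear that (2.12), the condition for minimaxity, can be rewritten

$$c \le \frac{n^*[\tau_0(c) + \tau_1(c)b^2 + \tau_2(c)b(1-b^2)^{1/2}]}{\tau_0'(c) + \tau_1'(c)b^2 + \tau_2'(c)b(1-b^2)^{1/2}} \tag{2.13}$$

for all $-1 \le b \le 1$. Finally, defining $\mathbf{b} = (b,(1-b^2)^{1/2})^t$,

$$A(c) = \begin{pmatrix}\tau_0(c)+\tau_1(c) & \tau_2(c)/2\\ \tau_2(c)/2 & \tau_0(c)\end{pmatrix}, \quad\text{and}\quad B(c) = \begin{pmatrix}\tau_0'(c)+\tau_1'(c) & \tau_2'(c)/2\\ \tau_2'(c)/2 & \tau_0'(c)\end{pmatrix},$$

line (2.13) becomes

$$c \le \frac{n^*\,\mathbf{b}^tA(c)\mathbf{b}}{\mathbf{b}^tB(c)\mathbf{b}}. \tag{2.14}$$
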
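
The equivalence of (2.13) and (2.14) follows by expanding the quadratic form. With $\mathbf{b} = (b,(1-b^2)^{1/2})^t$ and $A(c)$ the symmetric $2\times 2$ matrix with diagonal entries $\tau_0(c)+\tau_1(c)$ and $\tau_0(c)$ and off-diagonal entries $\tau_2(c)/2$,

$$\begin{aligned}
\mathbf{b}^tA(c)\mathbf{b} &= (\tau_0(c)+\tau_1(c))\,b^2 + 2\cdot\tfrac{1}{2}\tau_2(c)\,b(1-b^2)^{1/2} + \tau_0(c)(1-b^2)\\
&= \tau_0(c) + \tau_1(c)\,b^2 + \tau_2(c)\,b(1-b^2)^{1/2},
\end{aligned}$$

which is the bracketed numerator of (2.13). The same expansion with the primed quantities gives the denominator.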
Now for fixed $b$, the nonnegative solutions to (2.14) lie in an interval $0 \le c \le c_b$. This can most easily be seen by looking at (2.11) (an expression equivalent to (2.14)) and noting that the left hand side is decreasing in $c$. Thus defining

$$c_{n,p} = \inf_{-1\le b\le 1} c_b,$$

it follows that if

$$0 \le c \le c_{n,p} \tag{2.15}$$

then (2.14) will be satisfied for all $-1 \le b \le 1$, and hence $\delta^c$ will be minimax.
To get a more explicit equation for $c_{n,p}$, note from equation (2.12) (an equivalent expression to (2.14)) that $B(c)$ is positive definite. Hence if (2.14) holds for all $-1 \le b \le 1$, then

$$c \le n^*\,\mathrm{ch}_{\min}[B(c)^{-1}A(c)]. \tag{2.16}$$

Thus (2.15) $\Rightarrow$ (2.14) for all $-1 \le b \le 1$ $\Rightarrow$ (2.16). It is also clear that the reverse implications hold, so that

$$\{c: 0 \le c \le c_{n,p}\} = \{c: c \le n^*\,\mathrm{ch}_{\min}[B(c)^{-1}A(c)]\}.$$

It is also easy to check that

$$c_{n,p} = n^*\,\mathrm{ch}_{\min}[B(c_{n,p})^{-1}A(c_{n,p})],$$

$$c < n^*\,\mathrm{ch}_{\min}[B(c)^{-1}A(c)] \quad\text{if } 0 \le c < c_{n,p},$$

and

$$c > n^*\,\mathrm{ch}_{\min}[B(c)^{-1}A(c)] \quad\text{if } c > c_{n,p}.$$

Hence $c_{n,p}$ is the unique solution to

$$c = n^*\,\mathrm{ch}_{\min}(B(c)^{-1}A(c)). \tag{2.17}$$


As there appeared to be little hope of analytically obtaining solutions to (2.17), the computer was used to numerically compute the solutions. For a given $n$ and $p$, the values of the $\tau_i(c)$ and $\tau_i'(c)$ (and hence $A(c)$ and $B(c)$) were calculated by Monte Carlo methods using 4000 generations of $V$ (for $n=8$) to 1000 generations of $V$ (for $n=30$). (Unfortunately a larger number of generations could not be used due to the considerable expense of generating $V$ and performing the calculations involving $V$.) The resulting estimated solutions, $c_{n,p}$, to (2.17) were then found and are listed in Table 1. The standard deviations of these simulated solutions ranged from about .02 (for $p=3$) to about .1 (for $n-p = 4$).
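
The Monte Carlo procedure described above might be sketched as follows. This is not the authors' program: the function names, the use of NumPy, the reuse of one set of draws for every $c$, the bisection search, and the choice of $2(p-2)$ as the upper end of the search bracket are all assumptions made here for illustration.

```python
import numpy as np

def draw_statistics(n, p, draws, rng):
    """Simulate V ~ Wishart(I, n) and keep the quantities entering A(c) and B(c)."""
    rho = np.empty(draws)
    v1 = np.empty(draws)
    chmin = np.empty(draws)
    sub = np.empty((draws, 2, 2))
    for k in range(draws):
        G = rng.standard_normal((n, p))
        V = G.T @ G
        Vinv = np.linalg.inv(V)
        v1[k] = Vinv[0, 0]                                  # (1,1) element of V^{-1}
        v2 = (Vinv @ Vinv)[0, 0]                            # (1,1) element of V^{-2}
        rho[k] = 2.0 * np.trace(Vinv) - 4.0 * v2 / v1[k]
        chmin[k] = np.linalg.eigvalsh(V)[0]
        sub[k] = V[:2, :2]                                  # V11, V12, V21, V22
    return rho, v1, chmin, sub

def A_and_B(c, n_star, stats):
    """Sample versions of the 2x2 matrices A(c) and B(c)."""
    rho, v1, chmin, sub = stats
    in_omega = rho < c / n_star
    # Per draw: leading 2x2 block of V on Omega_c, ch_min(V) * I on the complement.
    M = np.where(in_omega[:, None, None], sub, chmin[:, None, None] * np.eye(2))
    M = M / v1[:, None, None]
    return (rho[:, None, None] * M).mean(axis=0), M.mean(axis=0)

def rhs_217(c, n_star, stats):
    """Right-hand side of (2.17): n* ch_min(B(c)^{-1} A(c))."""
    A, B = A_and_B(c, n_star, stats)
    return n_star * np.linalg.eigvals(np.linalg.solve(B, A)).real.min()

def estimate_c_np(n, p, draws=2000, seed=0, tol=1e-4):
    """Bisection for (2.17) on [0, 2(p-2)]; the bracket is an assumption of this sketch."""
    stats = draw_statistics(n, p, draws, np.random.default_rng(seed))
    n_star = n - p - 1
    lo, hi = 0.0, 2.0 * (p - 2)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid < rhs_217(mid, n_star, stats):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(estimate_c_np(n=12, p=3))
```

Reusing the same simulated draws for every value of $c$ keeps the estimated right-hand side of (2.17) from jumping between bisection steps; the result still carries Monte Carlo error and should not be read as a reproduction of Table 1.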

3.  Comments 

1.  The values $c_{n,p}$ are not the largest values of $c$ for which $\delta^c$ is minimax. Approximations were made in the proof (lines (2.8) and (2.9)) which resulted in a smaller than necessary upper bound. If one could somehow determine the "least favorable" matrix $\Sigma_Z$ in (2.7), the approximations could be eliminated and the largest possible value of $c$ obtained.

2.  The estimators $\delta^c$ have a singularity as $X \to 0$. There are numerous ways of eliminating the singularity, one of the simplest being used in the following estimator:

$$\delta^{*c}(X,W) = \Bigl(I - \frac{\min\{n^*X^tW^{-1}X,\,c\}\,\alpha\,Q^{-1}W^{-1}}{X^tW^{-1}X}\Bigr)X.$$

Through analogy with the known $\Sigma$ situation, it seems quite likely that $\delta^{*c}$ is itself minimax (for $0 \le c \le c_{n,p}$) and considerably better than $\delta^c$.

3.  If the linear restriction $R\theta = r^0$ is thought to hold, where $R$ is an $(m \times p)$ matrix of rank $m$ and $r^0$ is an $(m \times 1)$ vector, then the estimators $\delta^c$ and $\delta^{*c}$ can be modified so that their regions of significant risk improvement coincide with the linear restriction. Indeed, defining $Y = RX - r^0$, $W^* = RWR^t$, and $\alpha^* = \mathrm{ch}_{\min}[(RQ^{-1}R^t)^{-1}W^*]/(n-m-1)$, Theorem 2 of Berger and Bock (1976b) can be used to show that

$$\delta_R^c = X - c\,\alpha^*Q^{-1}R^t(W^*)^{-1}Y/[Y^t(W^*)^{-1}Y]$$

is minimax if $0 \le c \le c_{n,m}$. The appropriate modification of $\delta^{*c}$ is the above estimator with $c$ replaced by $\min\{(n-m-1)Y^t(W^*)^{-1}Y,\,c\}$.


4.  If $(Q\Sigma)$ has a characteristic root considerably smaller than the other characteristic roots, then $\mathrm{ch}_{\min}(Q\Sigma)$ will be small compared to $\mathrm{tr}(Q\Sigma)$. From the definition of $\Delta_c(\theta,\Sigma)$ and line (2.2), it is apparent that the improvement obtained in using $\delta^c$ will be quite small. The estimator, $\delta^c$, will therefore perform best when $(Q\Sigma)$ has no exceptionally small roots. (If it is suspected that a coordinate might give rise to an exceptionally small root of $(Q\Sigma)$, it would probably pay to eliminate that coordinate in the construction of $\delta^c$, providing of course that there are at least three coordinates left.)
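
The singularity-free estimator of Comment 2 can be sketched in code as follows. The function name `delta_star_c` is hypothetical, NumPy is assumed to be available, and $Q$ and $W$ are assumed symmetric positive definite; the branch on $n^*X^tW^{-1}X \le c$ is an implementation choice made here so that no division takes place near $X = 0$.

```python
import numpy as np

def delta_star_c(X, W, Q, n, c):
    """Truncated estimator: c in (1.1) is replaced by min(n* X^t W^{-1} X, c)."""
    p = X.shape[0]
    n_star = n - p - 1
    # alpha = ch_min(QW)/n*, computed from the symmetric matrix L^t W L with Q = L L^t.
    L = np.linalg.cholesky(Q)
    alpha = np.linalg.eigvalsh(L.T @ W @ L)[0] / n_star
    Winv_X = np.linalg.solve(W, X)
    quad = X @ Winv_X  # X^t W^{-1} X
    if n_star * quad <= c:
        # min(...) = n* X^t W^{-1} X, so the quadratic form cancels and no division occurs.
        factor = n_star * alpha
    else:
        factor = c * alpha / quad
    return X - factor * np.linalg.solve(Q, Winv_X)

# Hypothetical usage: the estimate stays finite at X = 0.
print(delta_star_c(np.zeros(4), 5.0 * np.eye(4), np.eye(4), n=10, c=2.0))
```

With $Q = I$ and $W = n^*I$ this sketch becomes $(1 - \min\{|X|^2, c\}/|X|^2)X$, so observations with $|X|^2 \le c$ are shrunk all the way to the origin.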